Our research questions about strawberry production and chemical use in the US are:
i) Is the type of chemical correlated with the amount of strawberries produced (measured in lb/acre/application)?
ii) Do different states use different chemicals?
iii) How does the amount of strawberries produced differ by state?
To answer these questions, we obtained two data sets, cleaned and merged them into one, and created graphs, tables, and maps for exploratory data analysis (EDA).
We used two data sets: USDA strawberry data and pesticide data. The strawberry data set provided in class did not seem to contain enough information on production in each state for our research, so we downloaded the data again from the USDA web site and obtained a data set with 10,215 observations of strawberry data across 10 states.
For the pesticide data, we used the data set provided in class.
Finally, we combined these two data sets, deleted rows with no information, and selected the variables needed for EDA.
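The combine, clean, and select steps can be sketched in R with dplyr as follows. This is only an illustration on made-up data: the column names, the placeholder chemical names, the toxicity labels, and the choice of an inner join on `chemical` are all assumptions, not the exact code or schema used for this report.

```r
library(dplyr)

# Made-up stand-ins for the two data sets (all names and values are hypothetical).
strawberry <- data.frame(
  State         = c("State A", "State A", "State B", "State C"),
  Year          = c(2018, 2019, 2019, 2019),
  chemical      = c("chem_1", "chem_2", "chem_1", "chem_3"),
  chemical.type = c("type_A", "type_B", "type_A", "type_A"),
  Value         = c(2.1, 0.02, NA, 1.5),
  Domain        = "CHEMICAL",
  stringsAsFactors = FALSE
)
pesticide <- data.frame(
  chemical = c("chem_1", "chem_2"),
  toxicity = c("level_1", "level_2"),
  stringsAsFactors = FALSE
)

combined <- strawberry %>%
  inner_join(pesticide, by = "chemical") %>%  # keep chemicals present in both sets
  filter(!is.na(Value)) %>%                   # drop rows with no measurement
  select(State, Year, chemical, chemical.type, toxicity, Value)  # EDA variables only

print(combined)
```

An inner join discards strawberry rows whose chemical has no match in the pesticide table; a `left_join()` would keep them with a missing `toxicity` value instead.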
First, we analyzed the data set by making a bar chart, violin and box plots, and a density graph by chemical type to see whether the amount of strawberry production differs between the two chemical types.
We also analyzed the amount of strawberry production by year and state, using a Shiny application that lets us explore production trends in each state and year more easily.
Finally, we analyzed the amount of strawberry production by toxicity level.
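As a rough sketch of this kind of comparison, the ggplot2 code below draws a violin plot with an overlaid box plot and a density plot for two groups. The data are simulated, and the column names `chemical.type` and `Value` and the labels `type_A` and `type_B` are placeholders assumed for illustration, not the report's actual variables.

```r
library(ggplot2)

# Simulated measurements for two hypothetical chemical types.
set.seed(1)
toy <- data.frame(
  chemical.type = rep(c("type_A", "type_B"), each = 50),
  Value         = c(rexp(50, rate = 1), rexp(50, rate = 2))
)

# Violin plot with a narrow box plot drawn on top of each violin.
violin_box <- ggplot(toy, aes(x = chemical.type, y = Value)) +
  geom_violin() +
  geom_boxplot(width = 0.1)

# Overlaid, semi-transparent density curves, one per chemical type.
density_plot <- ggplot(toy, aes(x = Value, fill = chemical.type)) +
  geom_density(alpha = 0.4)
```

Printing `violin_box` or `density_plot` renders the corresponding figure. If the distributions are strongly right-skewed, adding `scale_y_log10()` or `scale_x_log10()` may make the two groups easier to compare.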
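A by-state, by-year summary of the kind such an application might display can be sketched with dplyr as below. The data, the column names, and the choice of the mean as the summary statistic are assumptions made for illustration only.

```r
library(dplyr)

# Made-up observations (all names and values are hypothetical).
toy <- data.frame(
  State         = c("State A", "State A", "State A", "State B", "State B"),
  Year          = c(2018, 2018, 2019, 2019, 2019),
  chemical.type = c("type_A", "type_A", "type_B", "type_A", "type_A"),
  Value         = c(1, 3, 2, 4, 6)
)

by_group <- toy %>%
  group_by(State, Year, chemical.type) %>%
  summarise(
    mean_value = mean(Value),  # average measurement within each group
    n          = n(),          # number of observations within each group
    .groups    = "drop"        # return an ungrouped data frame
  )

print(by_group)
```

Setting `.groups = "drop"` makes the grouping of the result explicit, so later steps such as filtering by a selected state or year operate on a plain, ungrouped table.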